Iterative Hierarchical Optimization for Misspecified Problems (IHOMP)

Authors

  • Daniel J. Mankowitz
  • Timothy Arthur Mann
  • Shie Mannor
Abstract

Reinforcement Learning (RL) aims to learn an optimal policy for a Markov Decision Process (MDP). For complex, high-dimensional MDPs, it may only be feasible to represent the policy with function approximation. If the policy representation used cannot represent good policies, the problem is misspecified and the learned policy may be far from optimal. We introduce IHOMP as an approach for solving misspecified problems. IHOMP iteratively refines a set of specialized policies based on a limited representation. We refer to these policies as policy threads. At the same time, IHOMP stitches these policy threads together in a hierarchical fashion to solve a problem that was otherwise misspecified. We prove that IHOMP enjoys theoretical convergence guarantees, and we extend IHOMP to exploit Option Interruption (OI), enabling it to learn where policy threads can be reused. Our experiments demonstrate that IHOMP can find near-optimal solutions to otherwise misspecified problems and that OI can further improve the solutions.
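
The notion of misspecification described in the abstract can be illustrated with a toy sketch. The Python code below is not the IHOMP algorithm and does not come from the paper; the corridor task, the constant-action policy class, and all names (`rollout`, `solves_all`, `stitched`) are assumptions made purely for illustration. It shows that a single policy from a limited class can fail on a task, while two such policies, each assigned to its own partition of the state space, can solve it when combined.

```python
# Toy illustration only; this is NOT the IHOMP algorithm from the paper.
# Hypothetical task: a corridor of cells 0..10 with the goal in the middle.
N_CELLS = 11
GOAL = 5
MAX_STEPS = 20


def rollout(policy, start):
    """Return the number of steps needed to reach GOAL, or None on failure."""
    state = start
    for step in range(MAX_STEPS + 1):
        if state == GOAL:
            return step
        # Actions are -1 (left) or +1 (right); walls clip the move.
        state = min(N_CELLS - 1, max(0, state + policy(state)))
    return None


def solves_all(policy):
    """True if the policy reaches GOAL from every start cell."""
    return all(rollout(policy, s) is not None for s in range(N_CELLS))


# Limited representation (assumed): a policy may only output one constant action.
def always_left(state):
    return -1


def always_right(state):
    return +1


# Two specialized policies, one per partition of the state space,
# stitched together by a simple higher-level rule.
def stitched(state):
    thread = always_right if state < GOAL else always_left
    return thread(state)


if __name__ == "__main__":
    for name, policy in [("always_left", always_left),
                         ("always_right", always_right),
                         ("stitched", stitched)]:
        print(name, "solves every start:", solves_all(policy))
```

In this sketch the partition and the assignment of policies to partitions are fixed by hand. According to the abstract, IHOMP instead refines its policy threads iteratively and, with Option Interruption, learns where they can be reused.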

Similar resources

An iterative method for tri-level quadratic fractional programming problems using fuzzy goal programming approach

Tri-level optimization problems are optimization problems with three nested hierarchical structures, where in most cases conflicting objectives are set at each level of hierarchy. Such problems are common in management, engineering designs and in decision making situations in general, and are known to be strongly NP-hard. Existing solution methods lack universality in solving these types of pro...

A Hierarchical Production Planning and Finite Scheduling Framework for Part Families in Flexible Job-shop (with a case study)

The drive toward optimization in recent decades has led to multi-product manufacturing systems. Production planning in such systems is difficult because the calculated optimal production volume must be consistent with the limitations of the production system. Hence, integration has been proposed to decide on these problems concurrently. The main problem in integration is how we can relate pro...

A New Iterative Algorithm for Multivalued Nonexpansive Mapping and Equilibrium Problems with Applications

In this paper, we introduce two iterative schemes based on a modified Krasnoselskii-Mann algorithm for finding a common element of the set of solutions of equilibrium problems and the set of fixed points of multivalued nonexpansive mappings in a Hilbert space. We prove that the sequence generated by the proposed method converges strongly to a common element of the set of solutions of equilibrium proble...

Hierarchical Approach to Evolutionary Multi-Objective Optimization

In this paper, a new “hierarchical” evolutionary approach to solving multi-objective optimization problems is introduced. The results of experiments with standard multi-objective test problems, which were aimed at comparing “hierarchical” and “classical” versions of multi-objective evolutionary algorithms, show that the proposed approach is a very promising technique.

An Iterative Scheme for Generalized Equilibrium, Variational Inequality and Fixed Point Problems Based on the Extragradient Method

The generalized equilibrium problem is very general across different subjects. Optimization problems, variational inequalities, Nash equilibrium problems, and minimax problems are special cases of the generalized equilibrium problem. The purpose of this paper is to investigate the problem of approximating a common element of the set of generalized equilibrium problem, variational inequal...

Journal:
  • CoRR

Volume: abs/1602.03348

Publication date: 2016